03. Dataset: Oral Insulin Phase II Clinical Trial Data
The Phase II clinical trial dataset for a new oral insulin called Auralin, used in Lesson 3 (Assessing Data), is used again here in Lesson 4 (Cleaning Data). The same dataset preview video and disclaimer text are included here as well.
Dataset Oral Insulin Clinical Trial Data
DISCLAIMER: This Data Isn't "Real"
Auralin and Novodra are not real insulin products. This clinical trial data was fabricated for the sake of this course. The issues that you'll detect when assessing this data (and later clean) are meant to simulate real-world data quality and tidiness issues.
That said:
- This dataset was constructed in consultation with real doctors to ensure plausibility.
- This clinical trial data for an alternative insulin was inspired by, and closely mimics, a real clinical trial for an inhaled insulin called Afrezza.
- The data quality issues in this dataset mimic real, common data quality issues in healthcare data. These issues impact quality of care, patient registration, and revenue.
- The patients in this dataset were created using a fake name generator. The dataset does not include real names, addresses, phone numbers, emails, etc.
The video above is only a short preview of the dataset, intended to motivate. If you're not comfortable with the meaning of each column in each table, please revisit the Visual Assessment: Acquaint Yourself page in Lesson 3: Assessing Data. Descriptions of each column, and of the Auralin clinical trial as a whole, are presented there.